Project Milestone
Face Betrays Your Age
Yongfu Lou, Yanan Li and Xin Wang
Task Review
Even though relatively limited efforts have been made towards it, age estimation has a wide variety of applications, ranging from access control and human-machine interaction to person identification, data mining, and data organization.
The human face is the most essential reflection of age: during the aging process, changes in face shape and texture take place.
This project is about estimating human age from face images. It was split into four stages: obtaining and preprocessing the dataset, extracting shape and texture features, training models and predicting age, and comparing the prediction errors of different models. By this milestone, all four stages have been accomplished, while several pieces of future work are still planned to optimize the model and minimize the prediction error. These are stated at the end of this report.
Dataset
The database we use is FG-NET (Face and Gesture Recognition Research Network), built by the group of the European Union project FG-NET, which works on face and gesture recognition. The dataset contains 1002 images of 82 subjects whose ages vary from 0 to 69. Each image was manually annotated with 68 landmark points located on the face, and for each image the corresponding points file is available. The figure below shows a sample image and the landmarks on a face image.
Figure 1. A sample from FG-NET with Landmarks
Stage 1: Data Preprocessing
Step I. Image Selection.
To simplify the processing and ultimately enhance the prediction accuracy, we chose only the images in which the person faces the camera directly. Images like Figure 2 were not used. In the end, we kept 617 satisfactory images out of the 1002 images in the dataset.
Figure 2. An image sample that was not used
Step II. Image Graying.
Since we need to use the texture on the face as one important feature for predicting age, we converted all the images to grayscale; this is shown in Figure 3.
Figure 3. Image Graying
Step III. Image Rotation.
Each image was rotated until the line connecting the two eyes became horizontal; this is shown in Figure 4.
Figure 4. Image Rotation
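As an illustration of this step, a minimal sketch in Python is given below. The eye coordinates are assumed to come from the landmark file; the exact way they are read and the sign convention of the rotation are assumptions, not the project's actual implementation.

import numpy as np
from scipy import ndimage

def rotate_to_horizontal_eyes(image, left_eye, right_eye):
    # Angle of the line joining the two eyes with respect to the x-axis.
    dx, dy = right_eye[0] - left_eye[0], right_eye[1] - left_eye[1]
    angle = np.degrees(np.arctan2(dy, dx))
    # Rotate by that angle so the eye line becomes horizontal
    # (the sign may need flipping depending on the landmark coordinate convention;
    # the landmark points themselves would need the same rotation applied).
    return ndimage.rotate(image, angle, reshape=False)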
Step IV. Image Resizing.
First, we go through all the landmark coordinate files to find the shortest inter-eye distance D among all the images. Then we resize every image so that its eye distance equals D.
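A possible sketch of this resizing step follows; the eye landmark indices used here are hypothetical placeholders, since the actual indices depend on the FG-NET annotation order.

import numpy as np
from scipy import ndimage

LEFT_EYE, RIGHT_EYE = 31, 36   # hypothetical indices of the two eye landmarks

def eye_distance(points):
    """points: (68, 2) array of landmark coordinates for one image."""
    return np.linalg.norm(points[RIGHT_EYE] - points[LEFT_EYE])

def resize_all(images, landmarks):
    # D is the shortest inter-eye distance over the whole dataset.
    D = min(eye_distance(p) for p in landmarks)
    resized = []
    for img, pts in zip(images, landmarks):
        scale = D / eye_distance(pts)      # <= 1, so every image is shrunk
        resized.append(ndimage.zoom(img, scale))
    return resized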
Stage 2: Feature Extraction
ASM
The Active Shape Model (ASM) is a statistical, model-based image search method that builds a model from the statistics of images of the same class of target object. The main characteristic of ASM is that a statistical method is applied to model a certain category of target images, and prior knowledge of the target object is introduced to limit the feature search results to the model's range of variation.
ASM is a two-step process. The first step trains a model from the training data; the second step uses the trained model to guide the search of an image for the object's shape information. In our project, both the training data and the test data come from a dataset of correctly annotated images: the 68 landmarks are placed in the same way on every image of the training set, and this is done manually. The points can represent the boundary, internal features, etc. We therefore implemented only the first step and neglect the second step for now. By examining the statistics of the positions of the labeled points, a "Point Distribution Model" is derived; it gives the mean positions of the points and a number of parameters (eigenvalues) that control the main modes of variation found in the dataset.
There are three main steps in modeling the data.
1. Choosing the proper contour points.
The dataset left after the preprocessing stage contains 617 sample face images, each annotated with 68 points on the face. So we obtain a matrix, named Training Data, which contains the coordinates of the points and the image information.
2. Aligning the data.
Before PCA is applied, we align the vectors a little further by moving the center of the face to a fixed position in each image. After that, we obtain the new shape vectors $\mathbf{x}_i = (x_{i1}, y_{i1}, x_{i2}, y_{i2}, \ldots, x_{i68}, y_{i68})^T$.
3. Principal Component Analysis
PCA is applied to the shape vectors. We first compute the mean shape
$$\bar{\mathbf{x}} = \frac{1}{N}\sum_{i=1}^{N}\mathbf{x}_i$$
and the covariance matrix
$$\mathbf{S} = \frac{1}{N-1}\sum_{i=1}^{N}(\mathbf{x}_i-\bar{\mathbf{x}})(\mathbf{x}_i-\bar{\mathbf{x}})^T.$$
We then compute the eigenvalues $\lambda_j$ and eigenvectors $\mathbf{p}_j$ of the covariance matrix, with the eigenvalues listed in descending order, and choose the $t$ largest eigenvalues that explain 98% of the variance in the training shapes, i.e. the smallest $t$ with
$$\sum_{j=1}^{t}\lambda_j \ge 0.98\sum_{j}\lambda_j.$$
This gives $\lambda_1,\ldots,\lambda_t$ and the corresponding eigenvectors $\mathbf{P} = (\mathbf{p}_1,\ldots,\mathbf{p}_t)$. After PCA, each shape vector can be written as a linear combination of $\bar{\mathbf{x}}$ and $\mathbf{P}$:
$$\mathbf{x} \approx \bar{\mathbf{x}} + \mathbf{P}\mathbf{b},$$
where $\mathbf{b} = \mathbf{P}^T(\mathbf{x}-\bar{\mathbf{x}})$ are the shape parameters, i.e. the coefficients of the first $t$ modes.
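A minimal sketch of this shape PCA, assuming the aligned shape vectors are stacked as the rows of an N x 136 NumPy array, might look as follows:

import numpy as np

def build_shape_model(shapes, explained=0.98):
    """shapes: (N, 136) array; each row is an aligned shape (68 x,y pairs)."""
    mean_shape = shapes.mean(axis=0)
    centered = shapes - mean_shape
    cov = np.cov(centered, rowvar=False)              # 136 x 136 covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)            # returned in ascending order
    order = np.argsort(eigvals)[::-1]                 # re-sort in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # keep the t largest modes that explain 98% of the variance
    ratio = np.cumsum(eigvals) / eigvals.sum()
    t = int(np.searchsorted(ratio, explained)) + 1
    P = eigvecs[:, :t]
    return mean_shape, eigvals[:t], P

def shape_parameters(x, mean_shape, P):
    """Shape parameters b of one shape vector x."""
    return P.T @ (x - mean_shape)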
AAM
To establish a statistical texture model that reflects the global texture variation of the faces, we first need to obtain all the pixel grayscale values inside the face contour area, which can be regarded as extracting the shape-independent texture. We then model this shape-independent texture with principal component analysis.
1. Texture normalization.
We read the grey value of each pixel of the texture image in a fixed order to generate a vector $\mathbf{g} = (g_1, g_2, \ldots, g_n)^T$, where $n$ is the number of pixels. To compensate for illumination differences, we apply texture normalization, that is, we transform the grey vector so that it has zero mean and unit variance.
Figure 5. Example result of texture normalization
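A sketch of this normalization, assuming the texture has already been sampled into a one-dimensional grey-value vector:

import numpy as np

def normalize_texture(g):
    """Rescale a grey-value vector to zero mean and unit variance."""
    g = np.asarray(g, dtype=float)
    return (g - g.mean()) / g.std()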
2. Triangulation of the mean shape in the shape model.
We use the Delaunay triangulation method to divide the mean shape into a collection of triangles.
Figure 6. Result for triangulation of the mean shape
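A possible sketch of this triangulation, using SciPy's Delaunay implementation on the 68 mean-shape landmarks:

import numpy as np
from scipy.spatial import Delaunay

def triangulate_mean_shape(mean_shape):
    """mean_shape: (68, 2) array of mean landmark coordinates."""
    tri = Delaunay(mean_shape)
    return tri.simplices        # (num_triangles, 3) vertex indices into the 68 landmarks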
3. Obtaining the shape-independent texture.
Figure 7. Result of triangulation of a face sample
First, each face in the training set is deformed to the mean shape, triangle by triangle. After triangulation, each face image is composed of a collection of triangular meshes; therefore, by deforming each triangle, the whole face image can be deformed to the mean shape model. In this way, the appearance sample is obtained in the normalized frame of the mean shape model, and we get the shape-independent texture feature.
Figure 8. Example result of deformation to the mean shape
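The per-triangle deformation can be illustrated by the affine transform that maps a triangle in a face sample onto the corresponding triangle of the mean shape. The sketch below only computes and applies that transform to points; a full warp would also resample every pixel inside the triangle.

import numpy as np

def triangle_affine(src_tri, dst_tri):
    """src_tri, dst_tri: (3, 2) vertex arrays. Returns M (3, 2) such that
    [x, y, 1] @ M gives the warped point."""
    src_h = np.hstack([src_tri, np.ones((3, 1))])   # vertices in homogeneous form
    return np.linalg.solve(src_h, dst_tri)          # solves src_h @ M = dst_tri

def warp_points(points, M):
    """Apply the affine map to an (n, 2) array of points."""
    pts_h = np.hstack([points, np.ones((len(points), 1))])
    return pts_h @ M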
4. Principal Component Analysis.
Like ASM, AAM also uses principal component analysis to build a statistical model of the texture. As a result, we obtain the eigenvalues, eigenvectors, and mean texture of the appearance model, and after PCA each texture vector can be written as a linear combination of the mean texture and the eigenvectors, $\mathbf{g} \approx \bar{\mathbf{g}} + \mathbf{P}_g\mathbf{b}_g$.
Combination of ASM and AAM
Because AAM's original shape positioning algorithm is not precise enough, and ASM cannot take advantage of the texture information, we use a combination of ASM and AAM. After obtaining the shape model and the appearance model, each image can be described by a set of shape parameters $\mathbf{b}_s$ and texture parameters $\mathbf{b}_g$. Because the AAM is built on top of the shape model, the shape parameters and texture parameters are correlated, so it is not proper to combine the models directly; instead, we combine the two sets of parameters through a weight matrix $\mathbf{W}_s$:
$$\mathbf{b} = \begin{pmatrix}\mathbf{W}_s\mathbf{b}_s\\ \mathbf{b}_g\end{pmatrix}.$$
Finally, PCA is applied to the combined parameter vectors $\mathbf{b}$ to extract the age feature from the combined model.
Figure 9. The combination of AAM and ASM
Figure 10. Comparison of the original texture with the texture produced by the combined model.
(a) Original texture. (b) Texture produced by the combined model. (c) Difference between the original texture and the texture described by the combined model.
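A sketch of this combination step follows. The diagonal weight matrix used here is an assumption (in practice the weights balance the different units of shape coordinates and grey values), and the final PCA is done via an SVD for brevity.

import numpy as np

def combined_age_features(b_s, b_g, w=1.0):
    """b_s: (N, t_s) shape parameters, b_g: (N, t_g) texture parameters per image."""
    Ws = w * np.eye(b_s.shape[1])          # assumed diagonal weight matrix W_s
    b = np.hstack([b_s @ Ws, b_g])         # combined parameter vector for each image
    mean_b = b.mean(axis=0)
    # final PCA (via SVD) on the combined vectors yields the features used for age
    _, _, Vt = np.linalg.svd(b - mean_b, full_matrices=False)
    return (b - mean_b) @ Vt.T             # projections onto the combined modes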
Stage 3: Training Models and Prediction
Model I. SVR (Support Vector Regression)
Specifically, epsilon-SVR, one kind of SVR, is used to predict a person's age from a face image.
Epsilon-SVR
It introduces the parameter epsilon to measure the cost of errors on the training points: an epsilon-insensitive band is set so that the errors of all points lying inside the band are ignored, and only deviations larger than epsilon are penalized.
Figure 11. Non-linear epsilon-SVR
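With libsvm's Python interface, training epsilon-SVR might look roughly like the sketch below; the feature values and the -c / -p / kernel settings are placeholders rather than the parameters actually used in the experiments.

# Depending on the installation this may be `from svmutil import ...` instead.
from libsvm.svmutil import svm_train, svm_predict

# toy placeholders; in the project the features come from the combined shape/texture model
x_train = [[0.1, 0.3], [0.2, 0.1], [0.5, 0.7], [0.9, 0.4]]
y_train = [5.0, 12.0, 30.0, 55.0]          # ages

# -s 3: epsilon-SVR, -t 2: RBF kernel, -c: cost, -p: width of the epsilon-insensitive band
model = svm_train(y_train, x_train, '-s 3 -t 2 -c 1 -p 0.5')

# p_acc is (accuracy, mean squared error, squared correlation coefficient)
pred_ages, (acc, mse, scc), _ = svm_predict(y_train, x_train, model)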
Model II. SVC (Support Vector Classification)
In C-SVC, C stands for the penalty parameter in this formula:
$$\min_{\mathbf{w},\,b,\,\boldsymbol{\xi}} \ \frac{1}{2}\mathbf{w}^T\mathbf{w} + C\sum_{i=1}^{l}\xi_i \quad \text{subject to} \quad y_i\left(\mathbf{w}^T\phi(\mathbf{x}_i)+b\right) \ge 1-\xi_i, \ \ \xi_i \ge 0.$$
At this moment, this is no more than the standard SVM classification method; it will be optimized for this problem in our future work.
After converting the age labels to class indices, C-SVC is used to estimate the age range of a person from a face image. The age ranges are 0-5, 5-10, 10-15, ..., 65-70 for the 5-year-a-class model and 0-10, 10-20, 20-30, ..., 60-70 for the 10-year-a-class model.
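A sketch of the label conversion and C-SVC training, again with placeholder data and parameter values:

from libsvm.svmutil import svm_train, svm_predict   # or `from svmutil import ...`

def age_to_class(age, width=5):
    """Map an age to a class index: with width=5, 0 -> [0,5), 1 -> [5,10), ..."""
    return int(age // width)

# toy placeholders for the combined-model features and the corresponding ages
x_train = [[0.1, 0.3], [0.2, 0.1], [0.5, 0.7], [0.9, 0.4]]
ages = [4.0, 12.0, 33.0, 55.0]
y_train = [age_to_class(a, width=5) for a in ages]

# -s 0: C-SVC, -t 2: RBF kernel, -c: the parameter C in the formula above
model = svm_train(y_train, x_train, '-s 0 -t 2 -c 1')
pred_classes, _, _ = svm_predict(y_train, x_train, model)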
Stage 4: Comparing Prediction Errors of Different Models
Results:
Result for epsilon-SVR

indicator | result
iteration | 203
nu | 0.792000
obj | -404.595448
rho | -14.999804
nSV | 396
nBSV | 396
Squared correlation coefficient | 0.0710749
Mean squared error | 121.736
Standard error | 8.8885 (years)
Result for C-SVC

indicator | 5-year-a-class | 10-year-a-class
iteration | 16 | 20
nu | 0.266667 | 0.700000
obj | -3.982239 | -13.909430
rho | 0.997459 | 0.989280
nSV | 14 | 17
nBSV | 2 | 7
Total nSV | 500 | 500
Accuracy | 29.9145% (35/117) | 33.3333% (39/117)
Average misclassification error | 1.4359 | 0.8462
To implement C-SVC and epsilon-SVR, the SVM function library libsvm [10] is used.
Analysis.
Why is the result of SVR in a sense quite acceptable, while SVC performed poorly on this problem? Can we find another way?
The reason is that SVC treats all misclassifications equally, while it should not: predicting a person of age 12 to be in the 60-70 class is clearly more costly than predicting the 0-10 class. Therefore, we plan to add a misclassification cost factor to SVC, to see whether it enhances the performance of the model and how large the improvement is, if any.
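As an illustration of this idea, and of our reading of the "average misclassification error" reported above (assuming it is the mean absolute difference between the predicted and true class indices), a distance-weighted error can be computed as follows; this only expresses the cost notion, it is not yet the cost-sensitive SVC itself.

import numpy as np

def average_misclassification_error(true_classes, pred_classes):
    """Mean absolute difference in class index: predicting a 12-year-old (class 2
    for 5-year classes) as class 13 (65-70) costs 11, as class 1 (5-10) only 1."""
    true_classes = np.asarray(true_classes)
    pred_classes = np.asarray(pred_classes)
    return float(np.mean(np.abs(pred_classes - true_classes)))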
Another way to improve the prediction of SVC is to divide the classification process into two phases. The first phase tells whether the face belongs to a youth or an adult; in the second phase, the system uses SVC with different parameters to determine the specific age range.
Future work
1. Cost-sensitive SVC (as described at the end of the comparison of SVC and SVR).
2. Two-phase classification (as described at the end of the comparison of SVC and SVR).
3. Using n-fold cross-validation or leave-one-out to choose the best parameter values for SVR, to obtain a lower mean squared error.
4. If time permits, we will use Matlab to build a GUI application. The expected result is that, after a picture without any marked feature points is input, the program can report the person's estimated age.
References
[1] Geng, X., Zhou, Z.-H., & Smith-Miles, K. (2007). Automatic age estimation based on facial aging patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(12), 2234-2240.
[2] Park, U., Tong, Y., & Jain, A. K. (2010). Age-invariant face recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(5), 947-954.
[3] Ramanathan, N., Chellappa, R., & Biswas, S. (2009). Age progression in human faces: A survey. Journal of Visual Languages and Computing.
[4] Steiner, M. Facial Image-based Age Estimation.
[5] Xing Gao. Research on Facial Image Age Estimation.
[6] Hsu, C.-W., Chang, C.-C., & Lin, C.-J. (2003). A practical guide to support vector classification. Available at http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf.
[7] Van Ginneken, B., Frangi, A. F., Staal, J. J., ter Haar Romeny, B. M., & Viergever, M. A. (2002). Active shape model segmentation with optimal features. IEEE Transactions on Medical Imaging, 21(8), 924-933.
[8] Cootes, T. F., Taylor, C. J., Cooper, D. H., & Graham, J. (1995). Active shape models - their training and application. Computer Vision and Image Understanding, 61(1), 38-59.
[9] Cootes, T. F., Edwards, G. J., & Taylor, C. J. (2001). Active appearance models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(6), 681-685.
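[10] Chang, C.-C., & Lin, C.-J. LIBSVM: A library for support vector machines. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.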